Modeling Prosody Pattern of Chinese Expressive Speech and Its Application in Personalized Speech Conversion

Authors

  • Zhang Zhang
  • Zhiyong Wu
  • Jia Jia
  • Lianhong Cai
Abstract

This paper proposes an approach for modeling the prosody patterns of acoustic features in Chinese expressive speech. In a Chinese multi-syllabic prosodic word, one syllable is identified as the core syllable, based on the observation that speakers usually place more emphasis on it. The variations in acoustic features from neutral to expressive speech are then analyzed for both core and non-core syllables. The analysis finds that the acoustic variations of the core syllable are the most significant, that the variations of the non-core syllables are influenced by the core syllable, and that this influence decreases as a non-core syllable lies farther from the core syllable. A double-layer perturbation model is then proposed to model these prosody patterns and is further applied to generate personalized prosody patterns for personalized speech conversion. Experimental results indicate that the model can capture and regenerate the personalized characteristics of prosodic features in expressive speech.
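The abstract does not give the equations of the double-layer perturbation model, so the following is only a rough illustration of the reported pattern, not the authors' method. It assumes, purely for the sake of the sketch, that the core syllable's relative change is attenuated geometrically with distance; the function name `perturb_prosodic_word` and the parameters `core_scale` and `decay` are hypothetical.

```python
def perturb_prosodic_word(neutral, core_index, core_scale, decay=0.5):
    """Map neutral per-syllable feature values (e.g. mean F0 in Hz) to
    expressive ones.

    Assumption (not from the paper): the relative change applied to the
    core syllable is attenuated by decay ** distance for each non-core
    syllable, where distance is counted in syllables from the core.
    """
    if not 0 <= core_index < len(neutral):
        raise ValueError("core_index must point at a syllable in the word")
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must lie in [0, 1]")

    expressive = []
    for i, value in enumerate(neutral):
        distance = abs(i - core_index)
        # The core syllable (distance 0) receives the full relative change;
        # non-core syllables receive a share that shrinks with distance.
        scale = 1.0 + (core_scale - 1.0) * (decay ** distance)
        expressive.append(value * scale)
    return expressive


# Three-syllable prosodic word with the core syllable in the middle.
print(perturb_prosodic_word([200.0, 200.0, 200.0], core_index=1, core_scale=1.5))
```

With `decay` set to 0 only the core syllable changes, and with `decay` set to 1 every syllable changes by the same factor, so the parameter spans the range between a purely local and a uniform perturbation.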

Similar resources

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis gives the computer the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...

Modeling of Fundamental Frequency Contour of Thai Expressive Speech using Fujisaki’s Model and Structural Model

Problem statement: In spontaneous speech communication, prosody is an important factor that must be taken into account, since it affects not only the naturalness but also the intelligibility of speech. A number of systems focusing on the synthesis of Thai expressive speech have been developed over the years. However, the expressive speech with various speaking styles has not been accomplishe...

Structural Modeling of Fundamental Frequency Contour for Thai Expressive Speech

Problem statement: Appropriate modeling of the fundamental frequency (F0) contour of speech is a key factor in preserving the quality of speech prosody. One successful approach has been developed for the tonal language Mandarin Chinese. It is based on the assumption that the behavioral characteristics of vocal-fold elongation in vibration could be approximated by those of a simple forced vibrating sy...

Emotion conversion using Feedforward Neural Networks

An emotion is made of several components such as physiological changes in the body, subjective feelings, and expressive behaviours. These changes in speech signal are mainly observed in prosody parameters such as pitch, duration and energy. In this work, prosody parameters are modified using instants of significant excitation (epochs) and these instants are detected using Zero Frequency Filteri...

Prosody generation in Chinese synthesis using the template of quantified prosodic unit and base intonation contour

This paper presents a prosody generation method for Mandarin Chinese using the template of quantified prosodic unit and base intonation contour. The method takes as its base unit the prosodic features extracted by rule from the syllables in prosodic words, and integrates the prosody rules of Mandarin Chinese prosodic words with the base intonation contour to achieve the prosody contours with...

Publication date: 2012